Recognizing Command Words using Deep Recurrent Neural Network for Both Acoustic and Throat Speech

Abstract

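The importance of speech command recognition in human-machine interaction systems has increased in recent years. In this study, we propose a deep neural network-based system for acoustic and throat command speech recognition. We apply a preprocessing pipeline to create the input of the deep learning model. First, speech commands are decomposed into components using well-known signal decomposition techniques. The Mel-frequency cepstral coefficients (MFCC) feature extraction method is then applied to each component of the speech commands to obtain the feature inputs for the recognition system. At this stage, we compare the performance of different speech decomposition techniques, namely wavelet packet decomposition (WPD), continuous wavelet transform (CWT), and empirical mode decomposition (EMD), in order to find the best technique for our model. We observe that WPD shows the best performance in terms of classification accuracy. This paper investigates a long short-term memory (LSTM)-based recurrent neural network (RNN), which is trained using the extracted MFCC features. The proposed neural network is trained and tested using acoustic speech commands. Moreover, we also train and test the proposed model using throat microphone speech commands. Lastly, the transfer learning technique is employed to increase the test accuracy for throat speech recognition: the weights of the model trained with the acoustic signal are used to initialize the model used for throat speech recognition. Overall, we have found significant classification accuracy for both acoustic and throat command speech. The LSTM model performs much better than the GMM-HMM model, convolutional neural networks such as CNN-tpool2, and residual networks such as res15 and res26, with an accuracy score of over 97% on Google’s Speech Commands dataset, and we achieve 95.35% accuracy on our throat speech data set using the transfer learning technique.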
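To make the decomposition step more concrete, the following is a minimal sketch of a wavelet packet decomposition in pure Python. It assumes the Haar wavelet and a fixed number of levels purely for illustration; the abstract does not state which wavelet or depth the authors used, and the function names `haar_step` and `wavelet_packet_decompose` are hypothetical. In the pipeline described above, MFCC features would then be extracted from each resulting component.

```python
import math


def haar_step(signal):
    """Split a signal into low-pass (approximation) and high-pass (detail) halves.

    Uses the orthonormal Haar filters, so signal energy is preserved.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_decompose(signal, levels):
    """Return the 2**levels leaf nodes of a Haar wavelet packet tree.

    Unlike the plain discrete wavelet transform, BOTH the approximation and
    the detail branch are split again at every level.
    Assumption: Haar wavelet; leaves are returned in natural (filter-bank)
    order, not sorted by frequency.
    """
    if levels < 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


if __name__ == "__main__":
    # Toy 8-sample "command"; a real input would be a sampled speech waveform.
    samples = [0.5, -1.0, 2.0, 3.0, -0.25, 0.0, 1.5, 4.0]
    for index, component in enumerate(wavelet_packet_decompose(samples, 2)):
        print(index, component)
```

Each leaf is a sub-band component of the original signal at a quarter of the original length (for two levels). The Haar filter is chosen here only because it needs no external library; a production pipeline would more likely rely on an established wavelet package.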

Similar resources

Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection

In the emerging field of acoustic novelty detection, most research efforts are devoted to probabilistic approaches such as mixture models or state-space models. Only recent studies introduced (pseudo-)generative models for acoustic novelty detection with recurrent neural networks in the form of an autoencoder. In these approaches, auditory spectral features of the next short term frame are pred...

A Deep Neural Network for Acoustic-Articulatory Speech Inversion

In this work, we implement a deep belief network for the acoustic-articulatory inversion mapping problem. We find that adding up to 3 hidden-layers improves inversion accuracy. We also show that this improvement is due to the higher expressive capability of a deep model and not a consequence of adding more adjustable parameters. Additionally, we show unsupervised pretraining of the system impro...

Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition

Deep learning has been widely applied to sequence modeling problems. In automatic speech recognition (ASR), performance has improved significantly with larger speech corpora and deeper neural networks. In particular, recurrent neural networks and deep convolutional neural networks have been applied to ASR successfully. Given the arising problem of training speed, w...

Deep Recurrent Neural Networks for Acoustic Modelling

We present a novel deep Recurrent Neural Network (RNN) model for acoustic modelling in Automatic Speech Recognition (ASR). We term our contribution a TC-DNN-BLSTM-DNN model; it combines a Deep Neural Network (DNN) with Time Convolution (TC), followed by a Bidirectional Long Short-Term Memory (BLSTM) and a final DNN. The first DNN acts as a feature processor to our model, the BLSTM the...

Tuning Recurrent Neural Networks for Recognizing Handwritten Arabic Words

Artificial neural networks have the abilities to learn by example and are capable of solving problems that are hard to solve using ordinary rule-based programming. They have many design parameters that affect their performance such as the number and sizes of the hidden layers. Large sizes are slow and small sizes are generally not accurate. Tuning the neural network size is a hard task because ...

Journal

Journal title: European Journal of Information Technologies and Computer Science

Year: 2023

ISSN: 2736-5492

DOI: https://doi.org/10.24018/compute.2023.3.2.88